Atom is better than RSS,
in ways that matter
Draft: started • Tagged /atom, /rss
My feeds are Atom feeds.
Everyone with knowledge agrees that Atom is technically superior to RSS.
Everything should have switched to it twenty years ago.
All feed readers support Atom just fine (except in podcasts, for no good reason).
Unfortunately, many people continued to write and choose RSS,
whether from ignorance of Atom (please stop calling feeds “RSS”!),
or from figuring it doesn’t really matter in the end.
Most of the differences are surface-level, or in practice not a problem:
- RSS uses a stupid date format? Doesn’t really affect people much.
- RSS doesn’t specify content encoding, so it’s ambiguous as to whether
<description> is text or HTML?
Used to be a problem, but these days everyone emits and assumes HTML.
In fact, RSS 2.0 specs settled on it being HTML,
though people often use the unreasonably-popular, wonkily-versioned, horribly-named
content:encoded to mean that, even though that’s not what it was supposed to mean.
- RSS is a travesty and the name refers to nine mutually-incompatible formats, published by a variety of different and competing organisations sometimes even using the same version numbers?
Meh, just ignore the XML namespaces, be sloppy,
the feeds you want to cope with will be worse.
Resign yourself, it’ll be easier in the long run.
Seriously, you may even be able to get away with forgetting that RSS is supposed to be XML.
But what few realise is that some of the differences matter.
That due to inconsistent treatment and usage,
some reasonable content cannot reliably be expressed in RSS,
whereas it’s unambiguous in Atom and should always work fine
(and if it doesn’t, it’s unambiguously a bug).
So let’s talk about the cases that matter.
I would like to convince people to prefer Atom for useful reasons,
rather than merely ideological purity.
I want RSS dead.
Title encoding semantics
Précis: you can’t reliably use characters like < or & in RSS titles.
Don’t even try mentioning HTML tags in titles.
Content management systems normally allow headings to contain markup;
but they seldom allow titles to contain markup.
This is a tragedy. Many an article calls for <code> or <em>.
I do it often, and I’ve seen a few others do it, but not many.
Atom defines something called a text construct,
where you can specify whether the value is text,
entity-encoded HTML, or XML-encoded HTML.
Titles are text constructs.
In RSS… well, not even content gets encoding semantics,
which used to be a real problem,
but over time everyone settled on “it’s always HTML”.
So titles definitely don’t get encoding semantics.
And different implementations do different things.
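To make the ambiguity concrete, here is a minimal Python sketch of two hypothetical readers looking at the same RSS title.
The item, the function names and the TextOnly helper are all made up for illustration; no real reader is being quoted.
One reader treats the decoded string as text, the other treats it as HTML, and they disagree about what the title says.

```python
from html.parser import HTMLParser
import xml.etree.ElementTree as ET

# Hypothetical RSS item; the author meant the literal text "<em>".
RSS_ITEM = "<item><title>Using the &lt;em> element</title></item>"


class TextOnly(HTMLParser):
    """Collects text nodes and drops tags (illustrative helper)."""

    def __init__(self):
        super().__init__()
        self.parts = []

    def handle_data(self, data):
        self.parts.append(data)


def title_as_text(item_xml):
    # Reader A: the XML-decoded string is the title, verbatim.
    return ET.fromstring(item_xml).findtext("title")


def title_as_html(item_xml):
    # Reader B: the XML-decoded string is HTML, so <em> becomes markup.
    parser = TextOnly()
    parser.feed(ET.fromstring(item_xml).findtext("title"))
    parser.close()
    return "".join(parser.parts)


print(repr(title_as_text(RSS_ITEM)))  # 'Using the <em> element'
print(repr(title_as_html(RSS_ITEM)))  # 'Using the  element'
```

Neither reader is wrong by the letter of RSS, which is the whole problem: the feed has no way to say which one was intended.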
Suppose you want to encode this title:
All about the <xmp> element
How are you going to do it?
In Atom, you have three clear choices:
- Discard the markup, and encode the text as text.
<title type="text">
All about the &lt;xmp> element
</title>
- Keep the markup, and entity-encode the HTML.
<title type="html">
All about the &lt;code>&amp;lt;xmp>&lt;/code> element
</title>
- Keep the markup, and represent the HTML as XML.
<title type="xhtml">
<div xmlns="http://www.w3.org/1999/xhtml">
All about the <code>&lt;xmp></code> element
</div>
</title>
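As a sanity check on those three forms, here is a rough Python sketch of how a consumer might reduce each to its visible text by dispatching on the type attribute.
The visible_text function and TextOnly helper are my own illustrative names, and the titles are parsed standalone (without the Atom namespace or surrounding feed) to keep the sketch short.

```python
from html.parser import HTMLParser
import xml.etree.ElementTree as ET

# The three encodings from above, whitespace trimmed.
TITLES = [
    '<title type="text">All about the &lt;xmp> element</title>',
    '<title type="html">'
    "All about the &lt;code>&amp;lt;xmp>&lt;/code> element</title>",
    '<title type="xhtml"><div xmlns="http://www.w3.org/1999/xhtml">'
    "All about the <code>&lt;xmp></code> element</div></title>",
]


class TextOnly(HTMLParser):
    """Collects text nodes and drops tags (illustrative helper)."""

    def __init__(self):
        super().__init__()
        self.parts = []

    def handle_data(self, data):
        self.parts.append(data)


def visible_text(title_xml):
    title = ET.fromstring(title_xml)
    kind = title.get("type", "text")  # Atom defaults to "text"
    if kind == "text":
        # The decoded string is the title, verbatim.
        return title.text or ""
    if kind == "html":
        # The decoded string is HTML: parse it and keep only the text.
        parser = TextOnly()
        parser.feed(title.text or "")
        parser.close()
        return "".join(parser.parts)
    if kind == "xhtml":
        # The single child div wraps real XHTML elements.
        return "".join(title[0].itertext())
    raise ValueError(f"unknown text construct type: {kind!r}")


for t in TITLES:
    print(repr(visible_text(t)))  # 'All about the <xmp> element' each time
```

All three come out as the same visible string, because the type attribute tells the consumer exactly how many layers of decoding to apply.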
TODO: survey RSS readers to find out how they handle all of these cases.
(This is the biggest reason I haven’t published this previously. Probably should have by 2020 or so.)
Hard mode challenge: <_>::v::<_>, a real title I used.
The summary/full item distinction
RSS bad, Atom good. Capiche? 😁
Heuristics, bad for interop, &c. &c.
I feel like there was a third Difference that Matters that I had in mind at some point. Maybe I’ll remember it before I publish.
The fly in the ointment: podcasts
I must, in good faith, mention podcasts.
Although approximately every single feed reader and open-source feed-reading library
from the last twenty years supports both RSS and Atom,
most of the largest podcast feed readers or syndicators only support RSS.
Since this stuff is largely from 20–25 years ago,
and I wasn’t paying attention to the space at the time (few twelve-year-olds would),
details are hazy.
Rough timeline:
- Podcasting started to be a thing.
- RSS 2.0 added enclosures for it.
- Atom was designed, to fix most of the problems with RSS.
- It stabilised.
- Apple released iTunes podcasting stuff only supporting RSS.
- The iTunes client got Atom support, but the iTunes Music Store did not.
- Atom was published as RFC 4287.
I have found no clear evidence that the iTunes Music Store
(later the Apple Podcasts catalogue) ever supported Atom.
And whatever Atom support iTunes/Apple Podcasts ever did have was removed in 2023.
So I want to blame Apple.
They controlled the space in the critical early days, and they ruined it.
Some podcast readers or syndicators support Atom.
Some claim to but don’t.
Some (probably the most important ones) just don’t.
RSS is a mess. Podcast RSS is even more of a mess.
It is frozen in time, with hack piled upon hack to work around RSS’s shortcomings.
Ugh.